Validity (statistics)

Validity is the extent to which a concept, conclusion, or measurement is well-founded and likely corresponds accurately to the real world.^[1]^[2] The word "valid" is derived from the Latin validus, meaning strong. The validity of a measurement tool (for example, a test in education) is the degree to which the tool measures what it claims to measure.^[3] Validity is judged on the strength of a collection of different types of evidence (e.g. face validity, construct validity), described in greater detail below.

In psychometrics, validity has a particular application known as test validity: "the degree to which evidence and theory support the interpretations of test scores" ("as entailed by proposed uses of tests").^[4]

It is generally accepted that the concept of scientific validity addresses the nature of reality in terms of statistical measures, and as such it is an epistemological and philosophical issue as well as a question of measurement. The use of the term in logic is narrower, relating to the relationship between the premises and conclusion of an argument. In logic, validity refers to the property of an argument whereby, if the premises are true, the truth of the conclusion follows by necessity. The conclusion of an argument is true if the argument is sound, which is to say if the argument is valid and its premises are true. By contrast, "scientific or statistical validity" is not a deductive claim that is necessarily truth-preserving, but an inductive claim whose truth or falsity remains undecided. This is why a claim of "scientific or statistical validity" is qualified as being either strong or weak; it is never necessary or certainly true. As a result, claims of "scientific or statistical validity" are open to interpretation as to what the facts of the matter actually mean.
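The logical sense of validity described above can be checked mechanically for simple propositional arguments, which makes the contrast with the statistical sense concrete. The following is a minimal sketch, not a standard library routine: the helper names `is_valid` and `implies` are illustrative assumptions, and the approach (enumerating every truth assignment) only applies to propositional logic with a small number of variables.

```python
from itertools import product


def implies(a, b):
    """Material implication: false only when a is true and b is false."""
    return (not a) or b


def is_valid(premises, conclusion, num_vars):
    """Return True if no truth assignment makes every premise true
    while making the conclusion false."""
    for values in product([True, False], repeat=num_vars):
        if all(p(*values) for p in premises) and not conclusion(*values):
            return False  # counterexample: true premises, false conclusion
    return True


# Modus ponens: from "p implies q" and "p", infer "q".
modus_ponens = is_valid(
    premises=[lambda p, q: implies(p, q), lambda p, q: p],
    conclusion=lambda p, q: q,
    num_vars=2,
)

# Affirming the consequent: from "p implies q" and "q", infer "p".
affirming_consequent = is_valid(
    premises=[lambda p, q: implies(p, q), lambda p, q: q],
    conclusion=lambda p, q: p,
    num_vars=2,
)

print(modus_ponens, affirming_consequent)
```

Under these assumptions the first argument form is reported as valid and the second as invalid, since the assignment in which p is false and q is true satisfies both premises of the second form but not its conclusion. No comparable exhaustive check exists for statistical validity, which is why it is graded as strong or weak rather than decided outright.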

Validity is important because it helps determine what types of tests to use, and helps ensure that researchers use methods that are not only ethical and cost-effective but also truly measure the ideas or constructs in question.

[1] Brians, Willnat, Manheim, Rich (2011). Empirical Political Analysis (8th ed.). Boston: Longman. p. 105.
[2] Campbell, Donald T. (1957). "Factors relevant to the validity of experiments in social settings". Psychological Bulletin. 54 (4): 297–312. doi:10.1037/h0040950. ISSN 1939-1455. PMID 13465924.
[3] Kelley, Truman Lee (1927). Interpretation of Educational Measurements. Yonkers-on-Hudson, NY: World Book Company. p. 14. "The problem of validity is that of whether a test really measures what it purports to measure..."
[4] American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (1999). Standards for Educational and Psychological Testing. Washington, D.C.: American Educational Research Association.
